Automatic Categorization of Google Search Results for Medical Queries using JDI
Authors
Abstract
The web has become the primary source of medical information for consumers and health professionals, and it is common for people to “Google” a medical topic. As the number of documents on the web grows, however, it becomes harder to locate the best documents quickly. Classifying results into meaningful categories helps guide users to the most relevant set of results. Journal Descriptor Indexing (JDI) is a novel approach to fully automatic indexing. In this paper we explore the feasibility of using JDI to organize Google search results for medical queries into meaningful categories. For our experiments, we used JDI in combination with a set of heuristics to automatically categorize the search results for 5 query terms. Three independent reviewers evaluated the automatic categorization of 3 documents for each query term. The results clearly suggest that this method offers promise. Additional work on improving the categorization, and on determining whether a term is medical or not, is also discussed.
Similar resources
Comparing a rule-based versus statistical system for automatic categorization of MEDLINE documents according to biomedical specialty
Automatic document categorization is an important research problem in Information Science and Natural Language Processing. Many applications, including Word Sense Disambiguation and Information Retrieval in large collections, can benefit from such categorization. This paper focuses on automatic categorization of documents from the biomedical literature into broad discipline-based categories. Tw...
'surfing for knowledge' finding semantically similar Web clusters
In this paper we present our technique for finding semantically similar clusters within web documents obtained from a set of queries retrieved from the Google search engine. This technique utilizes a clustering algorithm based on previous Latent Semantic Analysis (LSA) work pioneered by Deerwester. In this paper we demonstrate how by using our clustering algorithm we can resolve ambiguities pre...
Automatic Acquisition of Synonyms Using the Web as a Corpus
We present an original algorithm for automatic acquisition of synonyms from text. The algorithm measures the semantic similarity between pairs of words by comparing their local contexts extracted from the Web by series of queries against the Google search engine. The results show 11pt average precision of 63.16%.
Real time search on the web: Queries, topics, and economic value
Real time search is an increasingly important area of information seeking on the Web. In this research, we analyze 1,005,296 user interactions with a real time search engine over a 190 day period. Using query log analysis, we investigate searching behavior, categorize search topics, and measure the economic value of this real time search stream. We examine aggregate usage of the search engine, ...
Analysis of users’ query reformulation behavior in Web with regard to Wholistic/analytic cognitive styles, Web experience, and search task type
Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...
Publication date: 2008